Skip to content

Prevent calling stop or restart services during db upgrade#49098

Merged
balloob merged 16 commits intohome-assistant:devfrom
bdraco:block_shutdown_during_db_migration
Apr 13, 2021
Merged

Prevent calling stop or restart services during db upgrade#49098
balloob merged 16 commits intohome-assistant:devfrom
bdraco:block_shutdown_during_db_migration

Conversation

@bdraco
Copy link
Copy Markdown
Member

@bdraco bdraco commented Apr 12, 2021

Breaking change

Prevent the homeassistant.stop and homeassistant.restart services from doing
a shutdown while a background database upgrade is in progress.

While shutdown will block until the database upgrade is completed, the webserver
will shutdown successfully which will leave the user without feedback. Instead we
abort and log an error when they call the service to prevent confusion.

These service calls previously forced blocking to False in #15803 Since we now use asyncio.create_task this is no longer needed and we can now give feedback to the UI / api calls when something fails.

Screen Shot 2021-04-12 at 9 13 54 AM

Screen Shot 2021-04-12 at 9 13 49 AM

Screen Shot 2021-04-12 at 10 16 33 AM

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Example entry for configuration.yaml:

# Example configuration.yaml

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • Untested files have been added to .coveragerc.

The integration reached or maintains the following Integration Quality Scale:

  • No score or internal
  • 🥈 Silver
  • 🥇 Gold
  • 🏆 Platinum

To help with the load of incoming pull requests:

…grade

While shutdown will block until the database upgrade is completed, the webserver
will shutdown successfully which will leave the user without feedback. Instead we
raise an exception when they call the service to prevent confusion.
@probot-home-assistant
Copy link
Copy Markdown

Hey there @home-assistant/core, mind taking a look at this pull request as its been labeled with an integration (homeassistant) you are listed as a codeowner for? Thanks!
(message by CodeOwnersMention)

Comment thread homeassistant/components/recorder/__init__.py Outdated
self._completed_database_setup = None
self._event_listener = None

self.async_migration_event = asyncio.Event()
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. there is a race here since they can call stop before startup is finished. We probably need to set this as soon as we hit the point of checking the future.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A user would have to request a restart between when the hass_started.result call finishes and the variable is swapped. If we set the var before we check the result then they can't stop the instance because something is blocking startup which probably won't work anyways. Probably best to set it before since they upgraded intentionally and we are already at the point of no return.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this event only used in tests?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

sqlite3 commands to get the database upgrade to take a while for testing

update schema_changes set schema_version=0 where change_id=1;
delete from schema_changes where change_id>=2;
drop index ix_events_context_parent_id;
drop index ix_events_event_type_time_fired;
drop index ix_states_entity_id_last_updated;
drop index ix_states_event_id;
drop index ix_states_last_updated;
drop index ix_events_time_fired;

@bdraco bdraco mentioned this pull request Apr 12, 2021
21 tasks
@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

2021-04-12 08:17:23 ERROR (MainThread) [homeassistant.helpers.script.websocket_api_script] websocket_api script: Error executing script. Error for call_service at pos 1: The system cannot restart while a database upgrade in progress.
2021-04-12 08:17:23 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140117402835216] Error handling message: The system cannot restart while a database upgrade in progress.

Hmm..I got a log message but not toast

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

Hmm calling the service raises an exception on the backend but we get a success message back down the websocket

2021-04-12 08:44:45 ERROR (MainThread) [homeassistant.core] Error executing service: <ServiceCall homeassistant.restart (c:a1c82e4dc4fe1e12e3c75ead770f8332)>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1501, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1520, in _execute_service
    await handler.job.target(service_call)
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 719, in admin_handler
    await result
  File "/usr/src/homeassistant/homeassistant/components/homeassistant/__init__.py", line 138, in async_handle_core_service
    raise HomeAssistantError(
homeassistant.exceptions.HomeAssistantError: The system cannot restart while a database upgrade in progress.
{"type":"call_service","domain":"homeassistant","service":"restart","service_data":{},"id":27}
{"id": 27, "type": "result", "success": true, "result": {"context": {"id": "a1c82e4dc4fe1e12e3c75ead770f8332", "parent_id": null, "user_id": "c611be8fa223418f95cf18f4981d8b34"}}}

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

Well that is because we explicitly do not block on these so there is no way to give feedback

https://github.com/home-assistant/core/blob/dev/homeassistant/components/websocket_api/commands.py#L160

    if msg["domain"] == HASS_DOMAIN and msg["service"] in ["restart", "stop"]:
        blocking = False

That's likely because we don't want to have an outstanding task that blocks shutdown.

Maybe a new service for the frontend homeassistant.can_shutdown that checks config and for migration that does block and then chain a promise to do the stop/restart on success.

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

I'm going to sleep on this one

@bdraco bdraco changed the title Raise an exception when calling stop or restart services during db upgrade Prevent calling stop or restart services during db upgrade Apr 12, 2021
@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

Need to do more archaeology to find out why these are made blocking false forcefully

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

Was forced blocking=False for a long time

4de97ab
22a80cf

then

22a80cf

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

It was changed in 12e6920 #15803

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

These can still be tracked tasks. Since this calls async_create_task we can remove the block that was added in #15803 as it shouldn't be needed anymore, and then the UI will get feedback

@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

Shutdown can block for a while because we don't cancel setup retries at stop

bash-5.0# ./py-spy dump --pid 213
Process 213: python3 -m homeassistant --config /config
Python v3.8.7 (/usr/local/bin/python3.8)

Thread 213 (idle): "MainThread"
    _wait_for_tstate_lock (threading.py:1027)
    join (threading.py:1011)
    shutdown (concurrent/futures/thread.py:236)
    close (homeassistant/runner.py:92)
    run (asyncio/runners.py:51)
    run (homeassistant/runner.py:126)
    main (homeassistant/__main__.py:313)
    <module> (homeassistant/__main__.py:321)
    _run_code (runpy.py:87)
    _run_module_as_main (runpy.py:194)
Thread 257 (idle): "Thread-8"
    loop (paho/mqtt/client.py:1167)
    loop_forever (paho/mqtt/client.py:1779)
    _thread_main (paho/mqtt/client.py:3452)
    run (threading.py:870)
    _bootstrap_inner (threading.py:932)
    _bootstrap (threading.py:890)
Thread 258 (idle): "Thread-9"
    loop (paho/mqtt/client.py:1167)
    loop_forever (paho/mqtt/client.py:1779)
    _thread_main (paho/mqtt/client.py:3452)
    run (threading.py:870)
    _bootstrap_inner (threading.py:932)
    _bootstrap (threading.py:890)
Thread 268 (idle): "SyncWorker_15"
    _open_socket (websocket/_http.py:176)
    connect (websocket/_http.py:121)
    connect (websocket/_core.py:222)
    create_connection (websocket/_core.py:515)
    open (samsungtvws/remote.py:146)
    try_connect (homeassistant/components/samsungtv/bridge.py:229)
    _try_connect (homeassistant/components/samsungtv/config_flow.py:116)
    run (concurrent/futures/thread.py:57)
    _worker (concurrent/futures/thread.py:80)
    run (threading.py:870)
    _bootstrap_inner (threading.py:932)
    _bootstrap (threading.py:890)
bash-5.0# exit

@bdraco bdraco marked this pull request as ready for review April 12, 2021 20:17
@bdraco
Copy link
Copy Markdown
Member Author

bdraco commented Apr 12, 2021

I'll start a new PR to cancel the ConfigEntryRetry/PlatformNotReady retry at the stop event.

Copy link
Copy Markdown
Member

@balloob balloob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@balloob balloob merged commit 53853f0 into home-assistant:dev Apr 13, 2021
Copy link
Copy Markdown
Member

@MartinHjelmare MartinHjelmare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question above.

@github-actions github-actions Bot locked and limited conversation to collaborators Apr 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants